Abstract

Many artificial intelligence (AI) problems naturally map to NP-hard optimization problems.
This has the interesting consequence that enabling human-level capability in machines often
requires systems that can handle formally intractable problems. This issue can sometimes (but
possibly not always) be resolved by building special-purpose heuristic algorithms tailored to
the problem in question. Because of the continued difficulties in automating certain tasks that
are natural for humans, there remains a strong motivation for AI researchers to investigate and
apply new algorithms and techniques to hard AI problems. Recently a novel class of relevant
algorithms that require quantum mechanical hardware has been proposed. These algorithms,
referred to as quantum adiabatic algorithms, represent a new approach to designing both
complete and heuristic solvers for NP-hard optimization problems. In this work we describe how to
formulate image recognition, which is a canonical NP-hard AI problem, as a Quadratic
Unconstrained Binary Optimization (QUBO) problem. The QUBO format corresponds to the input
format required for D-Wave superconducting adiabatic quantum computing (AQC) processors.

1 Introduction

Humans currently have substantial performance advantages over machines in several areas,
including object recognition, knowledge representation, reasoning, learning and natural language
processing [RN03]. Intriguingly, most of the hard problems arising in these areas can naturally be
cast as NP-hard optimization problems, with the majority reducible to pattern matching problems
such as maximum common subgraph [Smi99, EV07, Bun00, BDK+08, Sin02]. The formal intractability
of most problems associated with human intelligence is at the heart of the continued difficulties AI
researchers face in mimicking or surpassing human capabilities in these areas.

It may seem surprising that capabilities that we take for granted and perform quite easily could
be computationally intractable. However, it is important to remember that this intractability does
not preclude efficient generation of approximate solutions. In practice, exact solutions to
optimization problems arising in AI are not required. Generally there is a graceful degradation of
performance as a solution moves away from global optimality. Because of this behavior, the ideal
computational approach is to use specialized heuristic algorithms to attack these problems [Sim95].
It is interesting to note that human brains are thought to contain structures specialized for
pattern matching ('wetware heuristics') that are used to support a variety of capabilities for which
humans still hold a performance advantage over machines, and that these structures have been used
as inspirations for development of successful heuristic algorithms [Sin02, Mou97, Mac91].

Figure 1: Object recognition by image matching proceeds by pairing points in two images that
correspond to the same structure in the outside world. In the algorithms considered here, both
feature similarity and geometric consistency are considered in determining to what extent two
images are similar.

In this article we focus on the quintessential pattern recognition problem of deciding whether 
two images contain the same object. This is a typical example of a capability in which humans 
outperform modern computing systems and can be thought of as an NP-hard optimization problem. 
We begin to explore whether quantum adiabatic algorithms [EFS00, CFGG00, BBTA99, SMTC02]
can be employed to obtain better solutions to this problem than can be achieved with classical opti- 
mization algorithms. The first step in this exploration is to map image recognition into the particular 
input format required for running quantum adiabatic algorithms on D-Wave superconducting AQC 
processors. 

2 Image matching 

A popular method to determine whether two images contain the same object is image matching.
Image matching in its simplest form attempts to find pairs of image features from two images
that correspond to the same physical structure. An image feature is a vector that describes the
neighborhood of a given image location. In order to find corresponding features, two factors are
typically considered: feature similarity, as for instance determined by the scalar product between
feature vectors, and geometric consistency. The latter is best defined when looking at rigid objects.
In this case the feature displacements are not random but exhibit correlations brought about by
a change in viewpoint. For instance, if the camera moves to the left we observe translations of
the feature locations in the image to the right. If the object is deformable or articulated then the
feature displacements are no longer solely determined by the camera viewpoint, but one can
still expect that neighboring features tend to move in a similar way. Thus image matching can
be cast as an optimization problem in which one attempts to minimize an objective function that
consists of two terms. The first term penalizes mismatches between features drawn from image one
and placed at corresponding locations in image two. The second term enforces spatial consistency
between neighboring matches by measuring the divergence between them. It has been shown that
this constitutes an NP-hard optimization problem [FH05].

Figure 2: Representation of images as labeled graphs. Shown are three exemplary interest points
for each image. The number of interest points detected is content dependent but is on the order
of several hundred for 640x480 resolution images with content as shown. Each interest point is
assigned a position, scale, and orientation [Low99]. In the figure the scale is indicated by a circle
and the orientation by a pointer. This information can be used to characterize the relative pose and
position of two interest points, denoted by the vectors $g$ next to the dotted lines.

3 Mapping Image Matching to a Quadratic Optimization Problem 

D-Wave AQC processors take as input problems of the form

\[
x_{\mathrm{opt}} = \arg\min_{x} \sum_{i \le j} Q_{ij}\, x_i x_j, \qquad x_i \in \{0, 1\},
\]

which are typically referred to as Quadratic Unconstrained Binary Optimization (QUBO) problems.
Physicists will recognize this objective as being closely related to the Ising energy function. In order 
to use D-Wave hardware to run quantum adiabatic algorithms, the problem of interest must first be 
converted to this format. In this section we will describe how to perform this conversion for image 
matching. 
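To make the input format concrete, the following minimal sketch evaluates a QUBO objective of the standard form $\sum_{i \le j} Q_{ij} x_i x_j$ over binary $x$ and minimizes it by exhaustive enumeration. The function names and the three-variable coefficient matrix are hypothetical illustrations, not part of any D-Wave interface, and brute force is only practical for very small instances.

```python
from itertools import product


def qubo_energy(Q, x):
    """Return the sum over i <= j of Q[i][j] * x[i] * x[j] for a binary vector x."""
    n = len(x)
    return sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(i, n))


def brute_force_qubo(Q):
    """Enumerate all 2^n binary vectors and return the minimizer and its energy."""
    n = len(Q)
    best = min(product((0, 1), repeat=n), key=lambda x: qubo_energy(Q, x))
    return best, qubo_energy(Q, best)


# Hypothetical 3-variable instance with upper-triangular coefficients:
# each variable is rewarded (-1) and neighbouring variables are penalized (+2).
Q_example = [
    [-1, 2, 0],
    [0, -1, 2],
    [0, 0, -1],
]

best_x, best_e = brute_force_qubo(Q_example)
print(best_x, best_e)
```

In this toy instance the minimizer switches on the first and third variables, which are the two that do not share a penalty term.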

3.1 Representing an image as a labeled graph 

As a first step towards casting image matching problems as QUBO minimization, it is convenient
to reduce the amount of data and to focus on features that are sufficiently unique and robust under
small image transformations that finding a correspondence is well defined. This reduction to salient
image structures is accomplished with an interest point operator. Many versions of interest point
operators have been described in the literature [MS04, MTS+05]. In our implementation we use a
Laplacian of Gaussians operator. Each interest point $i$ is described by a normalized feature vector
$f_i$ called a local descriptor. We employ a new local descriptor called CONGAS that is based on Gabor
wavelets of varying scale and orientation drawn from a space variant grid around an interest point
[BN08][WFKvdM97]. Each pair of interest points is associated with a quantity $g_{ij}$ that describes
the geometric relationship between the interest points. $g_{ij}$ measures the translation between
interest points as well as the difference in local scale and orientation. The quantity $g_{ij}$ is
normalized for global translation, rotation and scaling. Using these concepts, any image can be
represented by a labeled graph $G$ (see Fig. 2).
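One way such a labeled graph might be stored is sketched below. The text does not specify how $g_{ij}$ is computed, so the `geometric_label` function is an assumption: it expresses the displacement to the second point in the local frame of the first (rotated by its orientation and divided by its scale), together with the log scale ratio and the relative orientation, which makes the label invariant to global translation, rotation and scaling. All names are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class InterestPoint:
    x: float
    y: float
    scale: float
    orientation: float  # radians
    descriptor: tuple   # normalized feature vector f_i


def wrap_angle(a):
    """Map an angle to the interval [-pi, pi)."""
    return (a + math.pi) % (2 * math.pi) - math.pi


def geometric_label(p, q):
    """Assumed form of g_ij: displacement in p's local frame, log scale ratio, relative angle."""
    dx, dy = q.x - p.x, q.y - p.y
    c, s = math.cos(-p.orientation), math.sin(-p.orientation)
    rx = (c * dx - s * dy) / p.scale
    ry = (s * dx + c * dy) / p.scale
    return (rx, ry, math.log(q.scale / p.scale),
            wrap_angle(q.orientation - p.orientation))


def build_labeled_graph(points):
    """Vertices carry descriptors f_i; ordered edges (i, j) carry g_ij."""
    vertices = {i: p.descriptor for i, p in enumerate(points)}
    edges = {
        (i, j): geometric_label(points[i], points[j])
        for i in range(len(points))
        for j in range(len(points))
        if i != j
    }
    return vertices, edges
```

Under this assumed definition, applying the same similarity transform to every interest point leaves all edge labels unchanged, which is the property the normalization is meant to provide.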

3.2 Generating a conflict graph $G_C$ from two image graphs $G_1$ and $G_2$

Images $I_1$ and $I_2$ give rise to collections of interest points. We associate the $M$ interest
points of $I_1$ with vertices in a labeled graph $G_1$ having $M$ nodes. The $i$th vertex in $G_1$ is
labeled with the feature vector $f_i$. Edges in $G_1$ between feature points $i$ and $j$ are labeled
with $g_{ij}$. Edges characterize the geometric relationships between feature vectors. Similarly,
image $I_2$ is represented as a labeled graph $G_2$ over $N$ features (vertices). In this
representation the similarity of two images is specified by the similarity of the two labeled graphs.

For pairs of interest points drawn from each image ($i \in G_1$, $\alpha \in G_2$) we calculate the
similarity of the associated feature vectors, $d(i, \alpha) = d_{\mathrm{feat}}(f_i, f_\alpha)$. Note
that when using subscripts we will use the convention that Latin subscripts refer to $G_1$ and Greek
subscripts to $G_2$. $d(i, \alpha)$ is a measure of the similarity of the interest points $i$ and
$\alpha$. A common choice to measure the similarity of two feature vectors, which we assume to be
normalized such that $|d(i, \alpha)| \le 1$, is the scalar product. Of all of the potential matches
between interest points in $G_1$ and $G_2$, some will be excellent (corresponding to $d(i, \alpha)$
close to one) and some will be poor. As we do not want to keep poor matches, we introduce a
point-wise inclusion threshold $T_{\mathrm{feat}}$. We define a pair of points $(i, \alpha)$ to be a
potential match if $d(i, \alpha) > T_{\mathrm{feat}}$. Increasing $T_{\mathrm{feat}}$ decreases the
number of potential matches, i.e. it raises the standard for what constitutes an acceptable
point-wise match.
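The thresholding step can be sketched as follows, assuming the scalar product is used as the feature similarity and that descriptors are already normalized. The function names and the two-dimensional toy descriptors are hypothetical; real local descriptors would have many more components.

```python
def scalar_product(f, g):
    """Similarity of two normalized feature vectors."""
    return sum(a * b for a, b in zip(f, g))


def potential_matches(features_1, features_2, t_feat):
    """Return {(i, alpha): d(i, alpha)} for all pairs whose similarity exceeds t_feat."""
    matches = {}
    for i, f_i in enumerate(features_1):
        for alpha, f_alpha in enumerate(features_2):
            d = scalar_product(f_i, f_alpha)
            if d > t_feat:
                matches[(i, alpha)] = d
    return matches


# Toy descriptors (unit length) for two features per image.
features_1 = [(1.0, 0.0), (0.0, 1.0)]
features_2 = [(0.6, 0.8), (1.0, 0.0)]

matches = potential_matches(features_1, features_2, t_feat=0.7)
stricter = potential_matches(features_1, features_2, t_feat=0.9)
print(sorted(matches), sorted(stricter))
```

Raising the threshold from 0.7 to 0.9 in this toy example drops one of the two potential matches, illustrating how the threshold controls the number of candidates.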

3.3 Generation of vertices in the conflict graph $G_C$

We generate a conflict graph $G_C$ from $G_1$ and $G_2$ to measure the similarity of the two
graphical representations of the images. Starting from the largest $d(i, \alpha)$, we add vertices
$V_{i\alpha}$ to $G_C$ in order of decreasing similarity until either all potential matches have been
included or the number of vertices reaches a hardware dependent limit $L$. Vertex $V_{i\alpha}$
corresponds to an association of feature $i$ in $G_1$ with feature $\alpha$ in $G_2$.
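A minimal sketch of this vertex selection, assuming the potential matches are held in a dictionary that maps each pair to its similarity (the names and values below are hypothetical):

```python
def select_vertices(matches, limit):
    """Keep the best-scoring potential matches, up to the vertex limit L."""
    ranked = sorted(matches.items(), key=lambda kv: kv[1], reverse=True)
    return [pair for pair, _ in ranked[:limit]]


# Hypothetical similarities d(i, alpha) that already passed the feature threshold.
toy_matches = {(0, 0): 0.95, (0, 1): 0.80, (1, 1): 0.90, (2, 2): 0.85}

vertices = select_vertices(toy_matches, limit=3)
print(vertices)
```

With a limit of three, the weakest of the four candidates is discarded; with a limit of four or more, all potential matches become vertices.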

3.4 Generation of edges in $G_C$

Edges $(i, \alpha; j, \beta)$ in $G_C$ encode geometric consistency between feature vectors $f_i$ and
$f_j$ in $G_1$ and feature vectors $f_\alpha$ and $f_\beta$ in $G_2$. For all vertex pairs
$(V_{i\alpha}, V_{j\beta})$ in $G_C$ where $i \ne j$ and $\alpha \ne \beta$ we calculate
$d(i, \alpha, j, \beta) = d_{\mathrm{geom}}(g_{ij}, g_{\alpha\beta})$. We call this quantity the
geometric consistency of the two pairs of interest points; it is normalized such that
$|d(i, \alpha, j, \beta)| \le 1$. It measures the geometric compatibility of the match pairs
$(i, \alpha)$ and $(j, \beta)$ as the residual differences in local displacement, scale and rotation
assignment of the associated interest points after changes due to global translation, rotation and
scaling have been normalized out.

Figure 3: Image matching via determining the maximum independent set (MIS) of a conflict graph.
The MIS problem is a special case of the more general QUBO problem class. In the MIS formulation
each vertex $V_{i\alpha}$ in the graph represents a match candidate for which the feature similarity
exceeded a threshold $T_{\mathrm{feat}}$. An edge is placed between any two matches that are
geometrically inconsistent. The MIS represents the largest set of matches that are geometrically
consistent.

The pairs $(i, j)$ and $(\alpha, \beta)$ are not allowed to match if they are in geometric conflict,
i.e. if the residual differences are too large. This requirement is enforced by choosing a threshold
$T_{\mathrm{geom}}$ whereby if $d(i, \alpha, j, \beta) < T_{\mathrm{geom}}$ the pairs $(i, j)$ and
$(\alpha, \beta)$ are considered to be in geometric conflict. Thus, the prescription for edge drawing
in $G_C$ is as follows:

• For all pairs of vertices $(V_{i\alpha}, V_{j\beta})$ in $G_C$, draw an edge between the vertex
pair $(V_{i\alpha}, V_{j\beta})$ if $i = j$ or $\alpha = \beta$.

• For all pairs of vertices $(V_{i\alpha}, V_{j\beta})$ in $G_C$, draw an edge between the vertex
pair $(V_{i\alpha}, V_{j\beta})$ if $i \ne j$ and $\alpha \ne \beta$ and $(V_{i\alpha}, V_{j\beta})$
are in geometric conflict (i.e. $d(i, \alpha, j, \beta) < T_{\mathrm{geom}}$).

In either case the presence of an edge records a conflict in associating $i$ with $\alpha$ and $j$
with $\beta$. Note that the first condition ensures that two features which are separate in one image
are not mapped onto a single feature in the other image.
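The edge-drawing prescription can be sketched as below. The geometric consistency function is passed in as a callable because its exact form is not specified here; the one-dimensional `toy_d_geom` is purely an illustrative assumption that compares displacements between hypothetical feature positions.

```python
from itertools import combinations


def conflict_edges(vertices, d_geom, t_geom):
    """Draw an edge between two match candidates that cannot both be accepted."""
    edges = set()
    for (i, a), (j, b) in combinations(vertices, 2):
        shares_feature = (i == j) or (a == b)  # first rule: a feature is reused
        if shares_feature or d_geom(i, a, j, b) < t_geom:  # second rule: geometry
            edges.add(((i, a), (j, b)))
    return edges


# Hypothetical 1-D feature positions in image 1 and image 2.
pos_1 = {0: 0.0, 1: 1.0, 2: 2.0}
pos_2 = {0: 0.0, 1: 1.0, 2: 5.0}


def toy_d_geom(i, a, j, b):
    """Toy consistency in (0, 1]: equals 1 when both displacements agree."""
    diff = abs((pos_1[j] - pos_1[i]) - (pos_2[b] - pos_2[a]))
    return 1.0 / (1.0 + diff)


toy_vertices = [(0, 0), (1, 1), (2, 2), (0, 1)]
edges = conflict_edges(toy_vertices, toy_d_geom, t_geom=0.5)
print(len(edges))
```

In this toy configuration only the candidates (0, 0) and (1, 1) are mutually consistent; every other pair either reuses a feature or disagrees geometrically.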

3.5 Structure of $G_C$

We call the graph generated by the preceding prescription a conflict graph. It has by construction
at most $L$ vertices, and has arbitrary connectivity with at most $L(L-1)/2$ edges. The maximum
independent set (MIS) of the conflict graph is equivalent to the maximum common subgraph of
unlabeled graphs $G_1$ and $G_2$. The MIS provides both a similarity measure (the larger the MIS,
the greater the region of mutual overlap) and the largest conflict-free mapping of features in $G_1$
to features in $G_2$. For example, if the MIS of $G_C$ is the set
$\{V_{i\alpha}, V_{j\beta}, V_{k\gamma}\}$ then the point-wise matches between the two graphs are
$i \leftrightarrow \alpha$, $j \leftrightarrow \beta$, and $k \leftrightarrow \gamma$. Finding the MIS
for the conflict graph $G_C$ can be cast as a QUBO by setting $Q_{i\alpha, i\alpha} = -1$ for all
vertices and $Q_{i\alpha, j\beta} = L$ whenever there is an edge between $(i, \alpha)$ and
$(j, \beta)$. The minimum energy configuration enforces $x_{i\alpha} = 1$ if and only if
$V_{i\alpha} \in \mathrm{MIS}$ and $x_{i\alpha} = 0$ otherwise. More elaborate objectives defining the
correspondence between images can be chosen. Those would not necessarily constitute an MIS problem
but could still be formulated as a QUBO.
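The MIS-to-QUBO construction described above can be checked on a small example. The sketch below builds the coefficients (−1 on each vertex, a penalty on each edge) as a dictionary and finds the minimum by exhaustive enumeration, which stands in for the quantum hardware and is only feasible for a handful of vertices. The toy conflict graph and all names are hypothetical.

```python
from itertools import product


def mis_qubo(vertices, edges, penalty):
    """QUBO coefficients: -1 per vertex, `penalty` per conflict edge."""
    Q = {(v, v): -1 for v in vertices}
    for u, v in edges:
        Q[(u, v)] = penalty
    return Q


def energy(Q, assignment):
    """Sum of Q[u, v] * x_u * x_v over all stored coefficient pairs."""
    return sum(c * assignment[u] * assignment[v] for (u, v), c in Q.items())


def brute_force_minimum(vertices, Q):
    """Return (energy, assignment) of the lowest-energy binary assignment."""
    best = None
    for bits in product((0, 1), repeat=len(vertices)):
        assignment = dict(zip(vertices, bits))
        e = energy(Q, assignment)
        if best is None or e < best[0]:
            best = (e, assignment)
    return best


# Hypothetical conflict graph with four match candidates and five conflicts.
toy_vertices = [(0, 0), (1, 1), (2, 2), (0, 1)]
toy_edges = [
    ((0, 0), (2, 2)), ((0, 0), (0, 1)), ((1, 1), (2, 2)),
    ((1, 1), (0, 1)), ((2, 2), (0, 1)),
]

Q = mis_qubo(toy_vertices, toy_edges, penalty=len(toy_vertices))
min_energy, x = brute_force_minimum(toy_vertices, Q)
selected = {v for v, bit in x.items() if bit == 1}
print(min_energy, selected)
```

Here the penalty is set to the number of vertices, mirroring the choice of $L$ in the text; the selected vertices form an independent set, and the negated minimum energy equals its size.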

4 Summary and Next Steps 

This article presents a description of how to map image recognition problems into the input format 
required for using D-Wave superconducting AQC processors. This represents a first step in deter- 
mining to what extent quantum adiabatic algorithms may be useful as components of novel solvers 
for the NP-hard optimization problems underlying image recognition. 

Quantum adiabatic algorithms are to date largely theoretical, and their utility as components of
either complete or heuristic solvers, as well as their actual performance in either regime, awaits
experimental verification [Llo08]. The next step in our reporting of this work will be a detailed
description of the results of solving QUBO instances generated by image matching problems
using D-Wave superconducting AQC hardware.